Dimensionality Reduction for Nonlinear Regression with Two Predictor Vectors

Authors

  • Yanjun Li
  • Yoram Bresler
Abstract

Many variables that we would like to predict depend nonlinearly on two types of attributes. For example, prices are influenced by supply and demand, and movie ratings are determined by demographic attributes and genre attributes. This paper addresses the dimensionality reduction problem in such regression problems with two predictor vectors. In particular, we assume a discriminative model in which low-dimensional linear embeddings of the two predictor vectors are sufficient statistics for predicting the dependent variable. We show that, surprisingly, a simple algorithm based on the singular value decomposition can accurately estimate the embeddings without specifying the nonlinear regression model, provided certain sample-complexity requirements are satisfied. These embeddings improve the efficiency and robustness of subsequent training, and the method can serve as a pre-training algorithm for neural networks. The main results establish sample complexities under multiple settings; the sample complexities for different regression models differ only by constant factors.
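The abstract describes the algorithm only at a high level. As a minimal illustrative sketch, the toy bilinear model below (rank-1 embeddings, a tanh link, and a y-weighted cross-moment matrix) is an assumption for demonstration, not the paper's exact procedure; it shows how an SVD can recover embedding directions without knowing the nonlinear link:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy discriminative model (illustrative assumption, not the paper's exact
# setup): y depends on predictors x, z only through the 1-D embeddings
# u.x and v.z, composed through an unknown nonlinear link.
n, p, q = 5000, 10, 8
u = rng.standard_normal(p); u /= np.linalg.norm(u)
v = rng.standard_normal(q); v /= np.linalg.norm(v)
X = rng.standard_normal((n, p))
Z = rng.standard_normal((n, q))
y = np.tanh((X @ u) * (Z @ v)) + 0.1 * rng.standard_normal(n)

# Model-free estimate: form the y-weighted cross-moment matrix
# M = (1/n) * sum_i y_i * x_i z_i^T and take its top singular vectors.
# For Gaussian predictors, E[y x z^T] is proportional to u v^T, so the
# leading singular vectors estimate the embedding directions even though
# the link function (here tanh) was never specified to the estimator.
M = (X.T * y) @ Z / n
U_svd, s, Vt_svd = np.linalg.svd(M)
u_hat, v_hat = U_svd[:, 0], Vt_svd[0, :]

# Alignment with the true directions (1.0 = perfect; sign-invariant)
align_u = abs(u_hat @ u)
align_v = abs(v_hat @ v)
print(round(align_u, 3), round(align_v, 3))
```

With 5000 samples the estimated directions are nearly parallel to the true ones; the estimator never evaluates or fits the nonlinearity.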


Related articles

Localized regression on principal manifolds

We consider nonparametric dimension reduction techniques for multivariate regression problems in which the variables constituting the predictor space are strongly nonlinearly related. Specifically, the predictor space is approximated via “local” principal manifolds, based on which a kernel regression is carried out.

Full text

A Survey of Dimensionality Reduction Techniques for Natural Language

Machine learning methods for natural language use features consisting of words or combinations of words to fit statistical models of linguistic phenomena. The discrete input spaces resulting from these features often have hundreds of thousands or millions of dimensions, and estimating reliable statistics of these features from limited amounts of training data is difficult. One technique for all...

Full text

A minimum discrepancy approach to multivariate dimension reduction via k-means inverse regression

We proposed a new method to estimate the intra-cluster adjusted central subspace for regressions with multivariate responses. Following Setodji and Cook (2004), we made use of the k-means algorithm to cluster the observed response vectors. Our method was designed to recover the intra-cluster information and outperformed the previous method with respect to estimation accuracy on both the central su...

Full text
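The abstract above outlines the idea without code. The following is a simplified sketch of the general approach — cluster the multivariate responses with k-means, then apply a SIR-style eigen-decomposition to the within-cluster predictor means; the toy data, hand-rolled k-means, and all parameters are illustrative assumptions, not the authors' exact estimator:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy data (illustrative): a bivariate response that depends on X only
# through one direction b, so the central subspace is span{b}.
n, p, k = 3000, 6, 8
b = np.array([1.0, 0, 0, 0, 0, 0])
X = rng.standard_normal((n, p))
t = X @ b
Y = np.column_stack([t + 0.1 * rng.standard_normal(n),
                     np.sin(t) + 0.1 * rng.standard_normal(n)])

def kmeans(Y, k, iters=50, rng=None):
    """Minimal Lloyd's k-means; returns cluster labels for each row of Y."""
    centers = Y[rng.choice(len(Y), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((Y[:, None, :] - centers[None]) ** 2).sum(-1), axis=1)
        for j in range(k):
            pts = Y[labels == j]
            if len(pts):
                centers[j] = pts.mean(0)
    return labels

# Slice the sample by clustering the response vectors.
labels = kmeans(Y, k, rng=rng)

# Inverse-regression step: eigen-decompose the weighted covariance of the
# within-cluster means of the standardized predictors; the leading
# eigenvector estimates the central-subspace direction.
Xs = (X - X.mean(0)) / X.std(0)
means, w = [], []
for j in range(k):
    mask = labels == j
    if mask.any():
        means.append(Xs[mask].mean(0))
        w.append(mask.mean())
means, w = np.array(means), np.array(w)
Mkir = (means.T * w) @ means
evals, evecs = np.linalg.eigh(Mkir)
b_hat = evecs[:, -1]                    # eigh sorts eigenvalues ascending
alignment = abs(b_hat @ b)
print(round(alignment, 3))
```

Clustering the responses plays the role that slicing a scalar response plays in classical SIR, which is what makes the idea applicable to multivariate responses.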

Using Manifold Learning for Nonlinear System Identification

A high-dimensional regression space usually causes problems in nonlinear system identification. However, if the regression data are contained in (or spread tightly around) some manifold, the dimensionality can be reduced. This paper presents a use of dimension reduction techniques to compose a two-step identification scheme suitable for high-dimensional identification problems with manifold-val...

Full text

Sliced Coordinate Analysis for Effective Dimension Reduction and Nonlinear Extensions

Sliced inverse regression (SIR) is an important method for reducing the dimensionality of input variables. Its goal is to estimate the effective dimension reduction directions. In classification settings, SIR is closely related to Fisher discriminant analysis. Motivated by reproducing kernel theory, we propose a notion of nonlinear effective dimension reduction and develop a nonlinear extension...

Full text
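Classical SIR, as summarized above, can be sketched in a few lines. The toy data, slice count, and link function below are illustrative assumptions; only the SIR recipe itself (standardize, slice on the response, eigen-decompose the slice-mean covariance) follows the method the abstract describes:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy data (illustrative): y depends on x only through the single
# effective-dimension-reduction direction b, via a nonlinear link.
n, p, H = 4000, 6, 10
b = np.array([3.0, 1.0, 0, 0, 0, 0]); b /= np.linalg.norm(b)
X = rng.standard_normal((n, p))
y = np.exp(X @ b) + 0.1 * rng.standard_normal(n)

# Classical SIR: standardize x, slice the sample on y, average the
# standardized predictors within each slice, and eigen-decompose the
# weighted covariance matrix of the slice means.
mu, Sigma = X.mean(0), np.cov(X, rowvar=False)
L = np.linalg.cholesky(np.linalg.inv(Sigma))   # whitener: z = (x - mu) @ L
Zw = (X - mu) @ L
order = np.argsort(y)
slices = np.array_split(order, H)
means = np.array([Zw[idx].mean(0) for idx in slices])
w = np.array([len(idx) / n for idx in slices])
Msir = (means.T * w) @ means
evals, evecs = np.linalg.eigh(Msir)
b_hat = L @ evecs[:, -1]        # map the leading direction back to x-space
b_hat /= np.linalg.norm(b_hat)
alignment = abs(b_hat @ b)      # 1.0 = perfect recovery (sign-invariant)
print(round(alignment, 3))
```

The nonlinear extension the paper proposes replaces the linear slice-mean statistics with kernel-based ones; the linear sketch above is only the starting point it builds on.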


Journal:
  • CoRR

Volume: abs/1602.04398  Issue: –

Pages: –

Publication date: 2016